141 research outputs found

    How to Complete an Interactive Configuration Process?

    When configuring customizable software, it is useful to provide interactive tool support that ensures that the configuration does not breach given constraints. But when is a configuration complete, and how can the tool help the user to complete it? We formalize this problem and relate it to concepts from non-monotonic reasoning that are well researched in Artificial Intelligence. The results are interesting for both practitioners and theoreticians. Practitioners will find a technique facilitating an interactive configuration process and experiments supporting feasibility of the approach. Theoreticians will find links between well-known formal concepts and a concrete practical application. Comment: to appear in SOFSEM 201

    FastqPuri: high-performance preprocessing of RNA-seq data

    Background: RNA sequencing (RNA-seq) has become the standard means of analyzing gene and transcript expression in high-throughput. While previously sequence alignment was a time demanding step, fast alignment methods and even more so transcript counting methods which avoid mapping and quantify gene and transcript expression by evaluating whether a read is compatible with a transcript, have led to significant speed-ups in data analysis. Now, the most time demanding step in the analysis of RNA-seq data is preprocessing the raw sequence data, such as running quality control and adapter, contamination and quality filtering before transcript or gene quantification. To do so, many researchers chain different tools, but a comprehensive, flexible and fast software that covers all preprocessing steps is currently missing. Results: We here present FastqPuri, a light-weight and highly efficient preprocessing tool for fastq data. FastqPuri provides sequence quality reports on the sample and dataset level with new plots which facilitate decision making for subsequent quality filtering. Moreover, FastqPuri efficiently removes adapter sequences and sequences from biological contamination from the data. It accepts both single- and paired-end data in uncompressed or compressed fastq files. FastqPuri can be run stand-alone and is suitable to be run within pipelines. We benchmarked FastqPuri against existing tools and found that FastqPuri is superior in terms of speed, memory usage, versatility and comprehensiveness. Conclusions: FastqPuri is a new tool which covers all aspects of short read sequence data preprocessing. It was designed for RNA-seq data to meet the needs for fast preprocessing of fastq data to allow transcript and gene counting, but it is suitable to process any short read sequencing data of which high sequence quality is needed, such as for genome assembly or SNV (single nucleotide variant) detection. FastqPuri is most flexible in filtering undesired biological sequences by offering two approaches to optimize speed and memory usage dependent on the total size of the potential contaminating sequences. FastqPuri is available at https://github.com/jengelmann/FastqPuri. It is implemented in C and R and licensed under GPL v3

    Substantial biases in ultra-short read data sets from high-throughput DNA sequencing

    Novel sequencing technologies permit the rapid production of large sequence data sets. These technologies are likely to revolutionize genetics and biomedical research, but a thorough characterization of the ultra-short read output is necessary. We generated and analyzed two Illumina 1G ultra-short read data sets, i.e. 2.8 million 27mer reads from a Beta vulgaris genomic clone and 12.3 million 36mers from the Helicobacter acinonychis genome. We found that error rates range from 0.3% at the beginning of reads to 3.8% at the end of reads. Wrong base calls are frequently preceded by base G. Base substitution error frequencies vary by 10- to 11-fold, with A > C transversion being among the most frequent and C > G transversions among the least frequent substitution errors. Insertions and deletions of single bases occur at very low rates. When simulating re-sequencing we found a 20-fold sequencing coverage to be sufficient to compensate errors by correct reads. The read coverage of the sequenced regions is biased; the highest read density was found in intervals with elevated GC content. High Solexa quality scores are over-optimistic and low scores underestimate the data quality. Our results show different types of biases and ways to detect them. Such biases have implications on the use and interpretation of Solexa data, for de novo sequencing, re-sequencing, the identification of single nucleotide polymorphisms and DNA methylation sites, as well as for transcriptome analysis
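
    The comparison between reported quality scores and observed error rates can be illustrated with a short sketch. It uses the standard definitions of Phred and Solexa quality scores; the function names are hypothetical and the code is not taken from the paper's analysis.

```python
import math


def phred_to_error_prob(q):
    """Nominal error probability for a Phred score: p = 10^(-Q/10)."""
    return 10.0 ** (-q / 10.0)


def solexa_to_error_prob(q):
    """Nominal error probability for a Solexa score.

    Solexa scores are log-odds: Q = -10 * log10(p / (1 - p)),
    which inverts to p = 1 / (1 + 10^(Q/10)).
    """
    return 1.0 / (1.0 + 10.0 ** (q / 10.0))


def error_prob_to_phred(p):
    """Phred-scaled score implied by an observed error rate p (0 < p <= 1)."""
    return -10.0 * math.log10(p)


if __name__ == "__main__":
    # Observed per-position error rates quoted in the abstract above.
    for label, rate in (("read start", 0.003), ("read end", 0.038)):
        print(label, "implied Phred score:", round(error_prob_to_phred(rate), 1))
```

    A reported score whose nominal error probability is lower than the observed error rate at that score is over-optimistic in the sense described above.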

    a.SCatch: semantic structure for architectural floor plan retrieval

    Architects’ daily routine involves working with drawings. They use either a pen or a computer to sketch out their ideas or to do a drawing to scale. We therefore propose the use of a sketch-based approach when using the floor plan repository for queries. This enables the user of the system to sketch a schematic abstraction of a floor plan and search for floor plans that are structurally similar. We also propose the use of a visual query language, and a semantic structure as put forward by Langenhan. An algorithm extracts the semantic structure sketched by the architect on DFKI’s Touch&Write table and compares the structure of the sketch with that of those from the floor plan repository. The a.SCatch system enables the user to access knowledge from past projects easily. Based on CBR strategies and shape detection technologies, a sketch-based retrieval gives access to a semantic floor plan repository. Furthermore, details of a prototypical application which allows semantic structure to be extracted from image data and put into the repository semi-automatically are provided

    EasyStrata: evaluation and visualization of stratified genome-wide association meta-analysis data.

    The R package EasyStrata facilitates the evaluation and visualization of stratified genome-wide association meta-analyses (GWAMAs) results. It provides (i) statistical methods to test and account for between-strata difference as a means to tackle gene-strata interaction effects and (ii) extended graphical features tailored for stratified GWAMA results. The software provides further features also suitable for general GWAMAs including functions to annotate, exclude or highlight specific loci in plots or to extract independent subsets of loci from genome-wide datasets. It is freely available and includes a user-friendly scripting interface that simplifies data handling and allows for combining statistical and graphical functions in a flexible fashion. AVAILABILITY: EasyStrata is available for free (under the GNU General Public License v3) from our Web site www.genepi-regensburg.de/easystrata and from the CRAN R package repository cran.r-project.org/web/packages/EasyStrata/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online
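
    A between-strata difference test of the kind mentioned in (i) can be sketched as a z-test on two stratum-specific effect estimates. This is a generic illustration in Python, not EasyStrata's R interface; the function name, parameter names, and the optional correlation term `r` are assumptions made for the example.

```python
import math


def strata_difference_test(beta1, se1, beta2, se2, r=0.0):
    """Z-test for the difference between two stratum-specific effects.

    beta1, beta2: effect estimates in stratum 1 and stratum 2
    se1, se2:     their standard errors
    r:            assumed correlation between the two estimates
                  (0.0 for independent strata)
    Returns (z, two_sided_p).
    """
    se_diff = math.sqrt(se1 ** 2 + se2 ** 2 - 2.0 * r * se1 * se2)
    z = (beta1 - beta2) / se_diff
    # Two-sided normal p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


if __name__ == "__main__":
    # Hypothetical per-stratum estimates for a single variant.
    z, p = strata_difference_test(0.10, 0.02, 0.04, 0.02)
    print("z =", round(z, 3), "p =", round(p, 4))
```

    In a genome-wide setting the same calculation would be applied to every variant, with the resulting p-values interpreted against a multiple-testing threshold.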

    EST-PAC a web package for EST annotation and protein sequence prediction

    With the decreasing cost of DNA sequencing technology and the vast diversity of biological resources, researchers increasingly face the basic challenge of annotating a larger number of expressed sequence tags (ESTs) from a variety of species. This typically consists of a series of repetitive tasks, which should be automated and easy to use. The results of these annotation tasks need to be stored and organized in a consistent way. All these operations should be self-installing, platform independent, easy to customize and amenable to using distributed bioinformatics resources available on the Internet. In order to address these issues, we present EST-PAC, a web-oriented multi-platform software package for expressed sequence tag (EST) annotation. EST-PAC provides a solution for the administration of EST and protein sequence annotations accessible through a web interface. Three aspects of EST annotation are automated: 1) searching local or remote biological databases for sequence similarities using Blast services, 2) predicting protein coding sequence from EST data, and 3) annotating predicted protein sequences with functional domain predictions. In practice, EST-PAC integrates the BLASTALL suite, EST-Scan2 and HMMER in a relational database system accessible through a simple web interface. EST-PAC also takes advantage of the relational database to allow consistent storage, powerful queries of results, and management of the annotation process. The system allows users to customize annotation strategies and provides an open-source data-management environment for research and education in bioinformatics

    The Metalloprotease Meprinβ Processes E-Cadherin and Weakens Intercellular Adhesion

    BACKGROUND: Meprin (EC 3.4.24.18), an astacin-like metalloprotease, is expressed in the epithelium of the intestine and kidney tubules and has been related to cancer, but the mechanistic links are unknown. METHODOLOGY/PRINCIPAL FINDINGS: We used MDCK and Caco-2 cells stably transfected with meprin alpha and/or meprin beta to establish models of renal and intestinal epithelial cells expressing this protease at physiological levels. In both models E-cadherin was cleaved, producing a cell-associated 97-kDa E-cadherin fragment, which was enhanced upon activation of the meprin zymogen and reduced in the presence of a meprin inhibitor. The cleavage site was localized in the extracellular domain adjacent to the plasma membrane. In vitro assays with purified components showed that the 97-kDa fragment was specifically generated by meprin beta, but not by ADAM-10 or MMP-7. Concomitantly with E-cadherin cleavage and degradation of the E-cadherin cytoplasmic tail, the plaque proteins beta-catenin and plakoglobin were processed by an intracellular protease, whereas alpha-catenin, which does not bind directly to E-cadherin, remained intact. Using confocal microscopy, we observed a partial colocalization of meprin beta and E-cadherin at lateral membranes of incompletely polarized cells at preconfluent or early confluent stages. Meprin beta-expressing cells displayed a reduced strength of cell-cell contacts and a significantly lower tendency to form multicellular aggregates. CONCLUSIONS/SIGNIFICANCE: By identifying E-cadherin as a substrate for meprin beta in a cellular context, this study reveals a novel biological role of this protease in epithelial cells. Our results suggest a crucial role for meprin beta in the control of adhesiveness via cleavage of E-cadherin with potential implications in a wide range of biological processes including epithelial barrier function and cancer progression

    Metalloprotease Meprinβ in Rat Kidney: Glomerular Localization and Differential Expression in Glomerulonephritis

    Meprin (EC 3.4.24.18) is an oligomeric metalloendopeptidase found in microvillar membranes of kidney proximal tubular epithelial cells. Here, we present the first report on the expression of meprinβ in rat glomerular epithelial cells and suggest a potential involvement in experimental glomerular disease. We detected meprinβ in glomeruli of immunostained rat kidney sections on the protein level and by quantitative RT-PCR of laser-capture microdissected glomeruli on the mRNA level. Using immuno-gold staining we identified the membrane of podocyte foot processes as the main site of meprinβ expression. The glomerular meprinβ expression pattern was altered in anti-Thy 1.1 and passive Heymann nephritis (PHN). In addition, the meprinβ staining pattern in the latter was reminiscent of immunostaining with the sheep anti-Fx1A antiserum, commonly used in PHN induction. Using Western blot and immunoprecipitation assays we demonstrated that meprinβ is recognized by Fx1A antiserum and may therefore represent an auto-antigen in PHN. In anti-Thy 1.1 glomerulonephritis we observed a striking redistribution of meprinβ in tubular epithelial cells from the apical to the basolateral side and the cytosol. This might point to an involvement of meprinβ in this form of glomerulonephritis

    Constraint solving in uncertain and dynamic environments - a survey

    This article follows a tutorial given by the authors on dynamic constraint solving at CP 2003 (Ninth International Conference on Principles and Practice of Constraint Programming) in Kinsale, Ireland. It aims at offering an overview of the main approaches and techniques that have been proposed in the domain of constraint satisfaction to deal with uncertain and dynamic environments